Adaptive Regression Kernels for Image/Video Restoration and Recognition
Abstract
I will describe a nonparametric framework for locally adaptive signal processing and analysis. This framework is based upon the notion of kernel regression, which we generalize to adapt to local characteristics of the given data, resulting in descriptors that take into account both the spatial density of the samples ("the geometry") and the actual values of those samples ("the radiometry"). These descriptors are exceedingly robust in capturing the underlying structure of the signals even in the presence of significant noise, missing data, and other disturbances. As the framework does not rely upon strong assumptions about noise or signal models, it is applicable to a wide variety of problems. On the processing side, I will illustrate examples in two and three dimensions, including state-of-the-art denoising, upscaling, and deblurring. On the analysis side, I will describe the application of the framework to training-free object detection in images, and action detection in video, from a single example.
© Optical Society of America
OCIS Codes: 100.2000 (Digital image processing); 100.3008 (Image recognition, algorithms and filters)
1 Steering Kernel Regression (SKR)
We first review the fundamental framework of kernel regression (KR) [3] and then describe its novel extension, steering kernel regression (SKR), in 2-D. The extension to 3-D is straightforward. The KR framework defines its data model as
$$y_i = z(\mathbf{x}_i) + \varepsilon_i, \quad i = 1, \dots, P, \quad \mathbf{x}_i = [x_{1i}, x_{2i}]^T, \tag{1}$$
where $y_i$ is a noisy sample at $\mathbf{x}_i$ ($x_{1i}$ and $x_{2i}$ are spatial coordinates), $z(\cdot)$ is the (hitherto unspecified) regression function to be estimated, $\varepsilon_i$ is i.i.d. zero-mean noise, and $P$ is the total number of samples in an arbitrary "window" around a position $\mathbf{x}$ of interest. As such, the kernel regression framework provides a rich mechanism for computing point-wise estimates of the regression function with minimal assumptions about global signal or noise models.
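While the particular form of $z(\cdot)$ may remain unspecified, we can develop a generic local expansion of the function about a sampling point $\mathbf{x}_i$. Specifically, if $\mathbf{x}$ is near the sample at $\mathbf{x}_i$, we have the $N$-th order Taylor series
$$\begin{aligned} z(\mathbf{x}_i) &= z(\mathbf{x}) + \{\nabla z(\mathbf{x})\}^T (\mathbf{x}_i - \mathbf{x}) + \frac{1}{2} (\mathbf{x}_i - \mathbf{x})^T \{\mathcal{H} z(\mathbf{x})\} (\mathbf{x}_i - \mathbf{x}) + \cdots \\ &= \beta_0 + \boldsymbol{\beta}_1^T (\mathbf{x}_i - \mathbf{x}) + \boldsymbol{\beta}_2^T \operatorname{vech}\{ (\mathbf{x}_i - \mathbf{x})(\mathbf{x}_i - \mathbf{x})^T \} + \cdots \end{aligned} \tag{2}$$
where $\nabla$ and $\mathcal{H}$ are the gradient (2×1) and Hessian (2×2) operators, respectively, and $\operatorname{vech}(\cdot)$ is the half-vectorization operator that lexicographically orders the lower triangular portion of a symmetric matrix into a column-stack vector. Furthermore, $\beta_0$ is $z(\mathbf{x})$, which is the signal (or pixel) value of interest, and the vectors $\boldsymbol{\beta}_1$ and $\boldsymbol{\beta}_2$ are
$$\boldsymbol{\beta}_1 = \left[ \frac{\partial z(\mathbf{x})}{\partial x_1}, \; \frac{\partial z(\mathbf{x})}{\partial x_2} \right]^T, \qquad \boldsymbol{\beta}_2 = \frac{1}{2} \left[ \frac{\partial^2 z(\mathbf{x})}{\partial x_1^2}, \; 2\,\frac{\partial^2 z(\mathbf{x})}{\partial x_1 \partial x_2}, \; \frac{\partial^2 z(\mathbf{x})}{\partial x_2^2} \right]^T. \tag{3}$$
Since this approach is based on local signal representations, a logical step to take is to estimate the parameters $\{\boldsymbol{\beta}_n\}_{n=0}^{N}$ from all the neighboring samples $\{y_i\}_{i=1}^{P}$ while giving the nearby samples higher weights than samples farther away. A (weighted) least-squares formulation of the fitting problem capturing this idea is to solve the following optimization problem,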
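The idea of fitting the local expansion by weighted least squares can be illustrated with a short sketch. This is not the source's formulation: it assumes an expansion truncated at N = 2, a Gaussian weight with a fixed global bandwidth `h` (the classical, non-steering case), and noise-free synthetic samples of a quadratic surface so that the result can be checked by hand. The names `design_matrix`, `local_quadratic_fit`, and `true_z` are hypothetical.

```python
import numpy as np

def design_matrix(dx):
    """Rows are [1, d1, d2, d1^2, d1*d2, d2^2] for each offset d = x_i - x.

    The last three columns are vech{d d^T}, matching the second-order
    term of the Taylor expansion.
    """
    d1, d2 = dx[:, 0], dx[:, 1]
    return np.column_stack([np.ones(len(dx)), d1, d2, d1 * d1, d1 * d2, d2 * d2])

def local_quadratic_fit(xs, ys, x, h=0.5):
    """Weighted least-squares fit of the second-order expansion around x.

    xs: (P, 2) sample positions, ys: (P,) samples, x: (2,) query point.
    h is an assumed global bandwidth for an assumed Gaussian kernel.
    Returns [beta0, beta1 (2 entries), beta2 (3 entries)].
    """
    dx = xs - x
    # Nearby samples get larger weights than samples farther away.
    w = np.exp(-0.5 * np.sum(dx ** 2, axis=1) / h ** 2)
    X = design_matrix(dx)
    sw = np.sqrt(w)
    # Scaling rows by sqrt(w) turns weighted least squares into ordinary least squares.
    beta, *_ = np.linalg.lstsq(X * sw[:, None], ys * sw, rcond=None)
    return beta

def true_z(p):
    """Synthetic quadratic surface used only for this demonstration."""
    return 1.0 + 2.0 * p[:, 0] - p[:, 1] + 0.5 * p[:, 0] ** 2 + p[:, 0] * p[:, 1]

rng = np.random.default_rng(0)
xs = rng.uniform(-1.0, 1.0, size=(200, 2))
ys = true_z(xs)  # noise-free, so the quadratic fit is exact up to rounding
beta = local_quadratic_fit(xs, ys, x=np.array([0.2, -0.3]))
```

Here `beta[0]` estimates the signal value at the query point, `beta[1:3]` its gradient, and `beta[3:]` the scaled second derivatives in the ordering of Eq. (3). With noisy samples the same call returns a smoothed estimate whose quality depends on the assumed bandwidth.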
Similar resources
Convolutional Neural Pyramid for Image Processing
We propose a principled convolutional neural pyramid (CNP) framework for general low-level vision and image processing tasks. It is based on the essential finding that many applications require large receptive fields for structure understanding. But corresponding neural networks for regression either stack many layers or apply large kernels to achieve it, which is computationally very costly. O...
Image Restoration with Two-Dimensional Adaptive Filter Algorithms
Two-dimensional (TD) adaptive filtering is a technique that can be applied to many image and signal processing applications. This paper extends the one-dimensional adaptive filter algorithms to TD structures and the novel TD adaptive filters are established. Based on this extension, the TD variable step-size normalized least mean squares (TD-VSS-NLMS), the TD-VSS affine projection algorithms (...
Face Detection at the Low Light Environments
Today, with the advancement of technology, the use of tools for extracting information from video is much wider in terms of both visual power and processing power. High-speed car, perfect detection accuracy, business diversity in the fields of medical, home appliances, smart cars, humanoid robots, military systems and the commercialization makes these systems cost effective. Among the most...
Video-based face recognition in color space by graph-based discriminant analysis
Video-based face recognition has attracted significant attention in many applications such as media technology, network security, human-machine interfaces, and automatic access control system in the past decade. The usual way for face recognition is based upon the grayscale image produced by combining the three color component images. In this work, we consider grayscale image as well as color s...
Adaptive Spectral Separation Two Layer Coding with Error Concealment for Cell Loss Resilience
This paper addresses the issue of cell loss and its consequent effect on video quality in a packet video system, and examines possible compensative measures. In the system's encoder, adaptive spectral separation is used to develop a two-layer coding scheme comprising a high priority layer to carry essential video data and a low priority layer with data to enhance the video image. A two-step er...